Extensive-Form Perfect Equilibrium Computation in Two-Player Games
Abstract
We study the problem of computing an Extensive-Form Perfect Equilibrium (EFPE) in 2-player games. This equilibrium concept refines the Nash equilibrium by requiring resilience w.r.t. a specific vanishing perturbation (representing mistakes of the players at each decision node). The scientific challenge is intrinsic to the EFPE definition: it requires a perturbation over the agent form, but the agent form is computationally inefficient, due to the presence of highly nonlinear constraints. We show that the sequence form can be exploited in a nontrivial way and that, for general-sum games, finding an EFPE is equivalent to solving a suitably perturbed linear complementarity problem. We prove that Lemke's algorithm can be applied, showing that computing an EFPE is PPAD-complete. In the notable case of zero-sum games, the problem is in FP and can be solved by linear programming. Our algorithms also allow one to find a Nash equilibrium when players cannot perfectly control their moves, being subject to a given execution uncertainty, as is the case in most realistic physical settings.
Introduction
Computing solutions of games is currently one of the hottest problems in computer science, as providing optimal strategies to autonomous agents interacting strategically is central in Artificial Intelligence (Shoham and Leyton-Brown 2008). Finding a Nash Equilibrium (NE), the basic solution concept for non-cooperative games, is PPAD-complete even in 2-player games (Chen, Deng, and Teng 2009), and it is unlikely that there is a polynomial-time algorithm, since it is commonly believed that FP ⊂ PPAD ⊂ FNP. We recall that a search problem is in the PPAD class if there is a path-following algorithm whose iterations have a polynomial-time cost. In the case of 2-player normal-form games, this algorithm is provided by Lemke and Howson (1964). Extensive-form games provide a richer representation of strategic interaction situations than the normal form.
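To make the Nash equilibrium condition concrete for the 2-player normal-form case, the following minimal Python sketch checks whether a pair of mixed strategies is an NE by testing that no pure deviation improves either player's expected payoff. The function names and the matching-pennies payoffs are illustrative assumptions, not taken from the paper, and the sketch only verifies a candidate profile; it does not implement the Lemke-Howson algorithm.

```python
from fractions import Fraction

def row_values(U1, y):
    """Expected payoff of each pure row action against the column mix y."""
    return [sum(u * q for u, q in zip(row, y)) for row in U1]

def col_values(U2, x):
    """Expected payoff of each pure column action against the row mix x."""
    n_cols = len(U2[0])
    return [sum(x[i] * U2[i][j] for i in range(len(x))) for j in range(n_cols)]

def is_nash(U1, U2, x, y):
    """True if neither player gains by deviating to any pure action."""
    r_vals = row_values(U1, y)
    c_vals = col_values(U2, x)
    value_row = sum(p * v for p, v in zip(x, r_vals))
    value_col = sum(q * v for q, v in zip(y, c_vals))
    return max(r_vals) <= value_row and max(c_vals) <= value_col

# Matching pennies (hypothetical example payoffs, not from the paper).
U1 = [[1, -1], [-1, 1]]
U2 = [[-1, 1], [1, -1]]
half = Fraction(1, 2)

uniform_is_ne = is_nash(U1, U2, [half, half], [half, half])
pure_is_ne = is_nash(U1, U2, [1, 0], [1, 0])
print(uniform_is_ne, pure_is_ne)  # prints: True False
```

Exact rational arithmetic is used so that the equilibrium test needs no numerical tolerance.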
The study of extensive-form games is much more involved than that of normal-form games. A variation of the Lemke-Howson algorithm, called Lemke's algorithm, finds an NE in a 2-player extensive-form game, showing that the problem is in the PPAD class (Koller, Megiddo, and von Stengel 1996). However, the concept of NE is not satisfactory in extensive-form games, and NE refinements are studied (Selten 1975). When information is perfect, the concept of Subgame Perfect Equilibrium (SPE) is satisfactory, while it is not when information is imperfect. In this latter case, refinements are usually based on the idea of perturbations representing mistakes of the players. In a Quasi-Perfect Equilibrium (QPE), proposed by van Damme, a player maximizes their utility in each decision node taking into account only the future mistakes of the opponents, whereas in an Extensive-Form Perfect Equilibrium (EFPE), proposed by Nobel laureate Selten, players maximize their utility in each decision node taking into account the future mistakes of both themselves and their opponents (Hillas and Kohlberg 2002). The sets of QPEs and EFPEs may be disjoint, so the two concepts require different techniques. Given a specific perturbation, computing a QPE is PPAD-complete (Miltersen and Sørensen 2010) and can be done by adding the perturbation to the constant terms in the linear constraints of the sequence form (von Stengel 1996); for this reason, we say that this perturbation is additive. However, the problem of efficiently computing an EFPE is still open. The scientific challenge is intrinsic to the EFPE definition: it is based on a perturbation over the agent form, but the agent form is computationally inefficient, presenting highly nonlinear equilibrium constraints.
Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
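The idea of refining NE through vanishing mistakes can be illustrated on a very small game. The sketch below is an assumption-laden toy, not the paper's algorithm: it uses a hypothetical 2x2 game in which each player has a single decision node, uniform minimum tremble probabilities `eps`, and the standard characterization that, in such a perturbed game, a profile is an equilibrium exactly when every action that is not a best response is played with its minimum probability. In this game both (T, L) and (B, R) are Nash equilibria, but only (T, L) is approached by equilibria of the perturbed games.

```python
from fractions import Fraction

# Hypothetical 2x2 game (not from the paper): rows T, B; columns L, R.
U1 = [[1, 0], [0, 0]]
U2 = [[1, 0], [0, 0]]

def row_values(U, y):
    """Expected payoff of each pure row action against the column mix y."""
    return [sum(u * q for u, q in zip(row, y)) for row in U]

def col_values(U, x):
    """Expected payoff of each pure column action against the row mix x."""
    return [sum(x[i] * U[i][j] for i in range(len(x))) for j in range(len(U[0]))]

def is_perturbed_equilibrium(U1, U2, x, y, eps):
    """Equilibrium test for the game where every action has probability >= eps.

    Every action must respect the lower bound, and every action that is not
    a best response must be played with exactly the minimum probability eps.
    """
    pairs = ((x, row_values(U1, y)), (y, col_values(U2, x)))
    for strategy, values in pairs:
        best = max(values)
        for prob, value in zip(strategy, values):
            if prob < eps:
                return False
            if value < best and prob != eps:
                return False
    return True

results = []
for k in (1, 2, 3):
    eps = Fraction(1, 10 ** k)
    near_TL = [1 - eps, eps]   # almost all weight on the first action
    near_BR = [eps, 1 - eps]   # almost all weight on the second action
    results.append((
        is_perturbed_equilibrium(U1, U2, near_TL, near_TL, eps),
        is_perturbed_equilibrium(U1, U2, near_BR, near_BR, eps),
    ))
print(results)  # prints: [(True, False), (True, False), (True, False)]
```

As `eps` shrinks, the profiles near (T, L) remain equilibria of the perturbed game, while those near (B, R) never are, because any tremble by the opponent makes the first action strictly better.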
The only previous attempt is that of Gatti and Iuliano (2011), which provides no proof of either the soundness or the polynomial-time cost of each algorithm iteration (details are in the Supplemental Material). We show that finding an EFPE is PPAD-complete in 2-player general-sum games and can be done by means of Lemke's algorithm with an extra polynomial computation cost due to a numeric perturbation, and that it is in FP in 2-player zero-sum games and can be done by linear programming with the same perturbation used in the general-sum case. The table below summarizes the results known so far; (∗) denotes an original contribution discussed in this paper.
Solution concept              | General-sum       | Zero-sum
Nash (NE)                     | PPAD-complete     | FP
Subgame Perfect (SPE)         | PPAD-complete     | FP
Quasi Perfect (QPE)           | PPAD-complete     | FP
Extensive-Form Perfect (EFPE) | PPAD-complete (∗) | FP (∗)
In order to prove our main result, we also provide two original results of broader interest. First, we show that a perturbation over the agent form can be formulated as a specific symbolic perturbation over the coefficients of the variables of the sequence form (for this reason, we say that this perturbation is multiplicative). This shows that computing an equilibrium when a player does not have perfect control over the execution of their moves along the game tree, as is customary for physical agents (e.g., robots) whose actions are subject to execution uncertainty, is PPAD-complete in general-sum games and in FP in zero-sum games. Second, we show that the symbolically perturbed problem above can be turned into a numerically perturbed problem. We believe our approach is particularly interesting because it is not limited to the computation of EFPEs: it is a more general framework that can be used to derive, e.g., the results on QPEs in a more natural fashion. All omitted proofs can be found in the Supplemental Material.
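One way to see why execution uncertainty at each decision node compounds multiplicatively along a path of the game tree is the small Python sketch below. It is only an illustration under stated assumptions, not the paper's symbolic perturbation: the names `executed_strategy`, `realization_probability`, `tremble`, and the two information sets `h1`, `h2` are hypothetical, and the executed probability of an action is modeled as its minimum tremble probability plus the intended probability scaled by the remaining mass.

```python
from fractions import Fraction
from math import prod

def executed_strategy(intended, tremble):
    """Map an intended behavioral strategy at one information set to the
    strategy actually executed when each action a is played with at least
    probability tremble[a] (assumed model of execution uncertainty)."""
    slack = 1 - sum(tremble.values())
    if slack <= 0:
        raise ValueError("trembles at one information set must sum to less than 1")
    return {a: tremble[a] + slack * intended[a] for a in intended}

def realization_probability(sequence, executed):
    """Probability that the player plays every (information set, action)
    pair of the sequence, i.e. the product of the executed probabilities."""
    return prod(executed[h][a] for h, a in sequence)

eps = Fraction(1, 10)
# Hypothetical player with two information sets and an intended pure strategy.
intended = {"h1": {"L": 1, "R": 0}, "h2": {"l": 1, "r": 0}}
tremble = {"h1": {"L": eps, "R": eps}, "h2": {"l": eps, "r": eps}}

executed = {h: executed_strategy(intended[h], tremble[h]) for h in intended}
p_intended_path = realization_probability([("h1", "L"), ("h2", "l")], executed)
print(executed["h1"]["L"], p_intended_path)  # prints: 9/10 81/100
```

The intended sequence is realized with probability 81/100 rather than 1: the per-node factors multiply, which is why a perturbation of the behavioral strategies does not reduce to adding a constant to each sequence probability.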
Preliminaries
In the following, we adopt the notation introduced by Shoham and Leyton-Brown (2008). We invite the reader unfamiliar with the topic to refer to that book or any other classic textbook on the subject for further information and context. An extensive-form game $\Gamma$ is defined over a game tree. In each non-terminal node a single player moves, and each edge corresponds to an action available to that player. As customary, $N$ denotes the set of players, $A_i$ denotes the set of actions available to player $i$, $a$ is an action, $\mathbf{a}$ denotes the action profile of all the players, and $\mathbf{a}_{-i}$ denotes the action profile of the opponents of player $i$. Furthermore, $H_i$ denotes the set of information sets of player $i$, and $h$ is an information set. Finally, $\iota(h)$ is the player that moves at $h$, $\rho(h)$ is the set of actions available at $h$ to player $\iota(h)$, and the function $u_i$ returns the utility of player $i$ at each terminal node.
The agent form (Selten 1975) of an extensive-form game is a tabular representation in which, for every player $i$ and information set $h \in H_i$, there is a fictitious player, called an agent, and all the agents of player $i$ have the same utility at the terminal nodes. Player $i$'s strategy, called behavioral, assigns a probability $\pi_i(a) \ge 0$ to each action $a$ and satisfies $\sum_{a \in \rho(h)} \pi_{\iota(h)}(a) = 1$ for each information set $h$. The strategy of the agent playing at $h$ is the restriction of $\pi_{\iota(h)}$ to the actions in $\rho(h)$. A behavioral strategy profile is denoted by $\pi$. The concept of Extensive-Form Perfect Equilibrium (Selten 1975), also known as "trembling hand perfect equilibrium", is defined on the agent form. We first introduce the definitions of a perturbed game (over the agent form) and of a Nash equilibrium of the agent form, since they are necessary to define the EFPE.
Definition 1. Let $\Gamma$ be an extensive-form game and $l(a) > 0$ be a positive number called perturbation such that
Similar resources
Computing an Extensive-Form Perfect Equilibrium in Two-Player Games
Equilibrium computation in games is currently considered one of the most challenging issues in AI. In this paper, we provide, to the best of our knowledge, the first algorithm to compute a Selten’s extensive–form perfect equilibrium (EFPE) with two–player games. EFPE refines the Nash equilibrium requiring the equilibrium to be robust to slight perturbations of both players’ behavioral strategie...
The Computation of Perfect and Proper Equilibrium for Finite Games via Simulated Annealing
This paper exploits an analogy between the "trembles" that underlie the functioning of simulated annealing and the player "trembles" that underlie the Nash refinements known as perfect and proper equilibrium. This paper shows that this relationship can be used to provide a method for computing perfect and proper equilibria of n-player strategic games. This paper also shows, by example, that...
Finding Equilibria in Games of No Chance
We consider finding maximin strategies and equilibria of explicitly given extensive form games with imperfect information but with no moves of chance. We show: 1. A maximin pure strategy for a two-player extensive form game with perfect recall and no moves of chance can be found in time linear in the size of the game tree. In contrast, it is known that this problem is NP-hard for games with cha...
Extensive-Form Correlated Equilibrium: Definition and Computational Complexity
This paper defines the extensive form correlated equilibrium (EFCE) for extensive games with perfect recall. The EFCE concept extends Aumann’s strategic-form correlated equilibrium (CE). Before the game starts, a correlation device generates a move for each information set. This move is recommended to the player only when the player reaches the information set. In two-player perfect-recall exte...
No. 1955 On Forward Induction Srihari Govindan
We examine Hillas and Kohlberg’s conjecture that invariance to the addition of payoff-redundant strategies implies that a backward induction outcome survives deletion of strategies that are inferior replies to all equilibria with the same outcome. That is, invariance and backward induction imply forward induction. Although it suffices in simple games to interpret backward induction as a subgame...